Interdisciplinary Centre for Bioinformatics Working Group 1: Databases and Data Integration – Selected Projects Objectives Annotation Integration / GeneMapper

Author

  • Erhard Rahm
Abstract

Figure labels: clinical trial groups (lymphoma, sepsis networks); tight integration (direct operation on the database); file-based exchange (data export/import to and from tools); transparent integration (data access using the database API); probe and gene intensities; sample and experiment annotations; gene annotations; uniform web-based GUI.

  • Existing tools + own developments
      • ArrayAnalyzer server
      • Statistical algorithms (WG3)
      • Canned/ad-hoc queries
      • (Data mining, OLAP)

Figure labels: selected accession IDs; mapped accession IDs.

  • Recent availability of draft versions of the human and chimpanzee genomes: first example of two closely related mammalian genomes
  • Goals: Design and implementation of an integrated platform for comparative analysis between humans and chimpanzees
  • Genome-wide comparison of sequence data, expression level, recombination rates, …
  • High volume of data: currently approx. 1 TB of microarray expression data in flat files from about 200 experiments available
  • Public sources with annotations refer to different gene representations
  • Goals: Providing gene-oriented views on annotations by matching between different gene representations
  • Flexible data management for gene expression analysis based on Affymetrix oligonucleotide arrays
  • Large amounts of data (around 500 experiment series per year) generated by local user groups
  • Innovative data warehouse approach:
      • Multidimensional data organization
      • Integration of sample/experiment and gene annotation data with expression data
      • Support for several normalization and aggregation algorithms
      • Integration of existing analysis tools
  • Management and analysis of complex molecular-biological data of users for research networks with a fast-growing amount of data
  • Design and implementation of flexible databases and analysis platforms for interdisciplinary projects and clinical studies
  • Database research topics:
      • Integration of molecular-biological data and metadata (e.g. annotations)
      • Database coupling / integration of analysis algorithms and tools
      • Flexible, high-performance data organization and querying

Current Results

  • Comparative evaluation of microarray-based gene expression databases showed limitations of previous approaches (BTW2003)
  • GeWare: Design and implementation of a data warehouse for gene expression analysis; first version of the warehouse operational
  • GeneMapper: Integration of gene annotations from different public sources; first version operational
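The abstract mentions turning selected accession IDs into mapped accession IDs by matching between different gene representations. The following is only a minimal sketch of that general idea, not GeneMapper's actual implementation: it assumes each public source contributes a simple ID-to-ID correspondence table, and every identifier and function name here (`compose`, `map_ids`, `probe_A`, `gene_1`, `annot_X`) is hypothetical.

```python
from collections import defaultdict


def compose(first, second):
    """Chain two ID mappings: a -> b and b -> c yield a -> c."""
    result = defaultdict(set)
    for source_id, intermediate_ids in first.items():
        for intermediate_id in intermediate_ids:
            result[source_id] |= second.get(intermediate_id, set())
    # Drop source IDs that reach nothing in the second mapping.
    return {key: targets for key, targets in result.items() if targets}


def map_ids(selected, mapping):
    """Return the mapped IDs (sorted) for each selected accession ID."""
    return {acc: sorted(mapping.get(acc, ())) for acc in selected}


# Hypothetical correspondence tables, standing in for two public sources.
probe_to_gene = {
    "probe_A": {"gene_1"},
    "probe_B": {"gene_1", "gene_2"},
    "probe_C": {"gene_9"},
}
gene_to_annotation = {
    "gene_1": {"annot_X"},
    "gene_2": {"annot_Y", "annot_Z"},
}

probe_to_annotation = compose(probe_to_gene, gene_to_annotation)
mapped = map_ids(["probe_A", "probe_B", "probe_C"], probe_to_annotation)
print(mapped)
```

Keeping each source as its own table and deriving longer paths by composition means a new source can be added without rewriting existing mappings; IDs with no path to the target representation simply map to an empty list.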


Similar resources

An Agent-oriented Notification System for Sequence (Re) Annotation in Genomic Databases

Most bioinformatics research projects have adopted database management systems. Each project builds its own database schema to store sequence (re) annotations. However, as the number of bioinformatics projects grows, new database management issues emerge, such as project management, project collaboration, schema integration, data distribution, data provenance, etc. This work investigates some ...


Graph-based sequence annotation using a data integration approach

The automated annotation of data from high throughput sequencing and genomics experiments is a significant challenge for bioinformatics. Most current approaches rely on sequential pipelines of gene finding and gene function prediction methods that annotate a gene with information from different reference data sources. Each function prediction method contributes evidence supporting a functional ...


Data Integration of Bioinformatics Database Based on Web Services

With the development of human genome projects (HGP) around the world, a large amount of genetic information has been generated. There are now hundreds of different kinds of important bioinformatics databases worldwide. How to unify bioinformatics databases from different countries has become an important issue in bioinformatics. In this paper, we present a data integration program based on Web Services of...


A comprehensive protein-centric ID mapping service for molecular data integration

Motivation: Identifier (ID) mapping establishes links between various biological databases and is an essential first step for molecular data integration and functional annotation. ID mapping allows diverse molecular data on genes and proteins to be combined and mapped to functional pathways and ontologies. We have developed comprehensive protein-centric ID mapping services providing mappings for...


Family Classification and Integrative Analysis for Protein Functional Annotation

The high-throughput genome projects have resulted in a rapid accumulation of predicted protein sequences; however, experimentally verified information on protein function lags far behind. The common approach to inferring the function of uncharacterized proteins based on sequence similarity to annotated proteins in sequence databases often results in over-identification, under-identification, or even...




Publication date: 2003